Acoustic-based recognition of head gestures accompanying speech
Abstract
Head movements are linked not only to symbolic gestures, such as head-nodding to represent “yes” or head-shaking to represent “no,” but also to the production of suprasegmental features of speech, such as stress, prominence, and other aspects of prosody. Recent studies have shown that head movements play a more direct role in the perception of speech. In this paper, we propose a novel method for recognizing head gestures that accompany speech. The proposed method tracks these head movements by localizing the mouth position with a microphone array system. The system relies only on acoustic information and uses no visual information. We also propose a recognition method for the mouth-position trajectory, in which Higher-Order Local Cross Correlation is applied to the trajectory. The recognition accuracy of the proposed method averaged 90.25% across nineteen kinds of head gesture recognition tasks conducted in an open-test manner, outperforming the Hidden Markov Model-based method.
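The abstract does not say how the microphone array localizes the mouth. As a generic illustration only, and not the authors' method, the sketch below shows one common building block of acoustic localization: estimating the time difference of arrival between two microphones by cross-correlation, then converting it to a far-field arrival angle. The function names, the two-microphone setup, and the sample values are all assumptions made for this example.

```python
import math


def estimate_delay(sig_a, sig_b, max_lag):
    """Return the lag (in samples) by which sig_b trails sig_a.

    Brute-force cross-correlation over lags in [-max_lag, max_lag].
    Hypothetical helper for illustration; not taken from the paper.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(sig_a)):
            j = i + lag
            if 0 <= j < len(sig_b):
                score += sig_a[i] * sig_b[j]
        # Keep the first lag that gives the highest correlation score.
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def arrival_angle(lag, sample_rate, mic_spacing, speed_of_sound=343.0):
    """Far-field arrival angle in degrees from broadside for one mic pair.

    Assumes a plane wave and a known microphone spacing in metres.
    """
    ratio = speed_of_sound * lag / (sample_rate * mic_spacing)
    ratio = max(-1.0, min(1.0, ratio))  # guard against rounding past +/-1
    return math.degrees(math.asin(ratio))


# Synthetic example: the same short burst reaches microphone B
# three samples after it reaches microphone A.
mic_a = [0.0] * 32
mic_a[10:13] = [1.0, 2.0, 1.0]
mic_b = [0.0] * 32
mic_b[13:16] = [1.0, 2.0, 1.0]

lag = estimate_delay(mic_a, mic_b, max_lag=5)
angle = arrival_angle(lag, sample_rate=16000, mic_spacing=0.1)
```

Repeating such an estimate frame by frame would give a sequence of source directions over time, which is the kind of trajectory a gesture recognizer could then classify. A practical system would likely use more microphones and a more robust correlation measure.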
Similar resources
Powered Wheelchair Control Using Acoustic-Based Recognition of Head Gesture Accompanying Speech
In this paper, we propose a novel interface for powered wheelchair control using acoustic-based recognition of head gestures accompanying speech. A microphone array mounted on a wheelchair localizes the position of the user’s voice. Because the localized position of the user’s voice almost corresponds with that of the mouth, the tracking of the head movements accompanying speech can be ach...
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert spoken utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
Improvement of multimodal gesture and speech recognition performance using time intervals between gestures and accompanying speech
We propose an integrative method for recognizing gestures, such as pointing, that accompany speech. Speech generated simultaneously with gestures can assist in the recognition of gestures, and since this occurs in a complementary manner, gestures can also assist in the recognition of speech. Our integrative recognition method uses a probability distribution which expresses the distribution of the t...
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
Mental Timeline in Persian Speakers’ Co-speech Gestures based on Lakoff and Johnson’s Conceptual Metaphor Theory
One of the introduced conceptual metaphors is the metaphor of "time as space". Time as an abstract concept is conceptualized by a concrete concept like space. This conceptualization of time is also reflected in co-speech gestures. In this research, we try to find out what dimension and direction the mental timeline has in co-speech gestures and under the influence of which one of the metaphoric...